Methods and systems to complete transaction data

ABSTRACT

A method and system to receive transaction data; determine a gap in the transaction data; and use an algorithm to generate data to fill in the gap is described. The algorithm is selected from a group including a first algorithm and a second algorithm. The first algorithm is to determine a dominant pattern in the transaction data; identify a region within the dominant pattern that corresponds to the gap in the transaction data; and adopt data associated with the corresponding region into the gap to minimize impact on the dominant pattern. The second algorithm includes a Moore-Penrose pseudo-inverse algorithm to choose the transaction data to fill in the gap based on a set of substitute data from among a group of substitute data sets and adopts the set of substitute data into the gap.

FIELD

The application relates generally to the field of transaction data, more specifically to methods and systems to complete transaction data, and to a machine-readable medium comprising instructions to perform this method.

BACKGROUND

Automatic Call Distribution (ACD) centers often use forecasting models to forecast transactions (e.g., calls or other communication requests) during certain periods of time. The forecasting models may be useful in determining adequate and efficient staff scheduling, for instance. Parameters for a forecasting model are often updated with new data to improve forecasting accuracy. Often, such updating is tedious and time consuming for an administrator of the forecasting model.

SUMMARY

According to an aspect of the invention, there is provided a method and system to receive transaction data; determine a gap in the transaction data; and use an algorithm to generate data to fill in the gap. The algorithm is selected from a group including a first algorithm and a second algorithm. The first algorithm is to determine a dominant pattern in the transaction data; identify a region within the dominant pattern that corresponds to the gap in the transaction data; and adopt data associated with the corresponding region into the gap to minimize impact on the dominant pattern. The second algorithm includes a Moore-Penrose pseudo-inverse algorithm to choose the transaction data to fill in the gap based on a set of substitute data from among a group of substitute data sets and adopts the set of substitute data into the gap.

BRIEF DESCRIPTION OF DRAWINGS

An example embodiment of the present invention is illustrated by way of example and not limitation in the figures of the accompanying drawings, in which like references indicate similar elements and in which:

FIG. 1 illustrates a system, according to an example embodiment of the present invention.

FIG. 2 illustrates a method of choosing an algorithm to fill in a transaction data gap, according to an embodiment.

FIG. 3 illustrates a method of implementing an algorithm, according to an example embodiment of the present invention.

FIG. 4 illustrates a method of implementing another algorithm, according to an example embodiment of the present invention.

FIG. 5 shows a diagrammatic representation of a machine in the example form of a computer system within which a set of instructions, for causing the machine to perform any one or more of the methodologies discussed herein, may be executed.

DETAILED DESCRIPTION

According to an aspect of the invention, there is provided a method and system to receive transaction data; determine a gap in the transaction data; and use an algorithm to generate data to fill in the gap. The algorithm is selected from a group including a first algorithm and a second algorithm. The first algorithm is to determine a dominant pattern in the transaction data; identify a region within the dominant pattern that corresponds to the gap in the transaction data; and adopt data associated with the corresponding region into the gap to minimize impact on the dominant pattern. The second algorithm includes a Moore-Penrose pseudo-inverse algorithm to choose the transaction data to fill in the gap based on a set of substitute data from among a group of substitute data sets and adopts the set of substitute data into the gap.

Architecture

FIG. 1 illustrates a system 100, according to an example embodiment of the present invention. The system 100 may be used in the context of Automatic Call Distribution (ACD) centers to forecast transactions (e.g., calls or other communication requests) during certain periods of time using forecast models.

The system 100 may include a transaction gap module 110, an external data source 120, a forecasting module 125, and a database 130. The transaction gap module 110 may include an interface 135 to receive transaction data from the database 130 regarding, for example, a particular forecast group and/or a particular period of time. The interface 135 may receive transaction data from the external data source 120 through a network 160, such as the Internet.

The database 130 includes data regarding frequency of transactions or calls during periods of time. The database 130 (and/or the external data source 120) may include invalid, missing or incomplete data 165.

The transaction gap module 110 determines if there is a gap (e.g., incomplete data 165) in the transaction data. The gap may be invalid data, such as a data error and/or missing/omitted data (null). The gap may be during a period of time, such as a day or a set of days in a monthly data set. A month (series of weeks) of (possibly incomplete) daily data and a list of dates of invalid data may be included in the transaction data. For each valid date in the month, the data may be a non-negative number.
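The gap determination described above may be sketched as follows. This is a minimal illustration only: the function name `find_gaps`, the use of Python `None` to stand for null, and the treatment of negative numbers as data errors are assumptions made for the example and are not taken from the embodiment.

```python
from typing import List, Optional, Tuple


def find_gaps(month: List[List[Optional[float]]]) -> List[Tuple[int, int]]:
    """Return (week, day) index pairs whose entry is missing or invalid."""
    gaps = []
    for i, week in enumerate(month):
        for j, value in enumerate(week):
            # A valid entry is a non-negative number; None (null) or a
            # negative value is treated here as a gap (assumption).
            if value is None or value < 0:
                gaps.append((i, j))
    return gaps


# Hypothetical two-week, three-day data set with one null and one data error.
example_month = [[120, None, 95], [110, 130, -1]]
print(find_gaps(example_month))
```

The returned index pairs play the role of the list of dates of invalid data that accompanies the monthly data set.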

The transaction gap module 110 may also include a selection module 140 used in determining which algorithm, a first algorithm 145 and/or a second algorithm 150, to use to fill in the gap or gaps in transaction data. An algorithm may replace the invalid, incomplete or missing data 165 in the forecast group with plausible and/or likely values to render a complete output. Several algorithm embodiments are described herein. For example, the first algorithm 145 may include a pattern recognition code 155. A month of daily data, where the data for each day in the month is a non-negative number, may be the output of the algorithm of the transaction gap module 110.

The transaction gap module 110 then sends the output, complete data 170 including the filled-in data, to the forecasting module 125 to forecast transactions.

FIG. 2 illustrates a method 200 of choosing an algorithm to fill in a transaction data gap, according to an embodiment.

At block 210, transaction data is received, as discussed herein.

At block 220, a gap in the transaction data is determined, as discussed herein.

At block 230, the algorithm used to fill in the gap is determined. The determined algorithm may depend on the size of the dataset. Additionally, and/or alternatively, the determined algorithm may depend on the desired accuracy of the filled-in data. Additionally, and/or alternatively, the determined algorithm may depend on the desired speed to fill in the missing or invalid data.

The algorithm described in FIG. 4 may render more accurate results as compared with the algorithm described in FIG. 3 when there is a large quantity of invalid data, e.g., greater than 50% of the days have missing or invalid data for the given month/forecast group.

However, the algorithm described in FIG. 4 may be computationally more expensive as compared with the algorithm described in FIG. 3. That is, more time and more processing capabilities of a system may be expended with the algorithm of FIG. 4, especially when the data sets are large. The first algorithm may be used when processing time for filling in the gap is to be minimized. The second algorithm may be used when accuracy for filling in the gap is to be maximized.
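One possible reading of the selection at block 230 is sketched below. The function name `choose_algorithm`, the size limit `max_cells`, and its default value are hypothetical; only the 50% figure for invalid data comes from the description above, and the ordering of the two tests is an assumption.

```python
def choose_algorithm(n_valid: int, n_rows: int, n_cols: int,
                     max_cells: int = 10_000) -> str:
    """Pick "first" (FIG. 3) or "second" (FIG. 4) for a data set.

    max_cells is a hypothetical administrator-set size limit above which
    the cheaper first algorithm is always used.
    """
    total = n_rows * n_cols
    if total > max_cells:
        # Large data set: favor speed over accuracy.
        return "first"
    invalid_fraction = 1.0 - n_valid / total
    # More than 50% invalid data: favor the more accurate second algorithm.
    return "second" if invalid_fraction > 0.5 else "first"


print(choose_algorithm(n_valid=15, n_rows=5, n_cols=7))
```

A deployment could equally weigh the desired accuracy and the desired speed in some other way; the sketch only shows that the decision can be reduced to a few counts.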

FIG. 3 illustrates a method 300 of implementing an algorithm, according to an example embodiment of the present invention.

At block 310, transaction data is received, as discussed herein.

At block 320, a gap in the transaction data is determined, as discussed herein.

At block 330, a dominant pattern in the transaction data is determined, using the algorithm, as discussed herein. The dominant pattern may be determined by the pattern recognition code 155.

At block 340, a region within the dominant pattern that corresponds to the gap in the transaction data may be identified, using the algorithm, as discussed herein.

At block 350, data associated with the corresponding region may be adopted into the gap to minimize impact on the dominant pattern, using the algorithm, as discussed herein.

Using the algorithm, invalid and/or missing data may be replaced with values that are consistent with the arrangement of the valid data. The algorithm and/or the transaction gap module 110 may also take into consideration any restrictions of the forecasting module 125. A forecasting module restriction may be that the number of calls during each week has the same pattern throughout the month, for example.

The algorithm of the embodiment of FIG. 3 may work best when the valid data is not too sparse in a given month. The valid data is not too sparse, for example, when the ratio of valid data to invalid data is greater than 1:1. The actual arrangement of days with invalid data and the degree of dominance of the pattern in the valid data may also impact the quality of the fill and/or a confidence in the fill.

Two examples of how the algorithm of FIG. 3 behaves for sparse valid data are described further below. Sparse valid data, as used here, may denote a qualitative and comparative state of a set of the data where there is less valid data than in some other comparable set of data.

In the below examples, in the first algorithm where a dominant pattern in the data may be determined and adopted to fill in the gap (e.g., null data sets), (i,j) refers to a j^(th) day of an i^(th) week, for n weeks with m days in each week, wherein x_(ij) includes valid numerical data, and if data is not valid on (i,j), x_(ij)=null.

v_(ij) is defined by v_(ij)=x_(ij), unless x_(ij)=0, in which case v_(ij)=null. w_(ij) is defined by w_(ij)=ln(v_(ij)) whenever v_(ij) is not null, and w_(ij)=null whenever v_(ij)=null.
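The mapping from x_(ij) to w_(ij) may be sketched as follows. The function name `to_log_matrix` and the use of `None` for null are assumptions made for this illustration; the sketch also maps any non-positive value to null, which goes slightly beyond the zero case stated above.

```python
import math
from typing import List, Optional

Matrix = List[List[Optional[float]]]


def to_log_matrix(x: Matrix) -> Matrix:
    """Map x_ij to w_ij = ln(x_ij); null or zero entries become None."""
    return [
        # v_ij is null when x_ij is null or zero, so no logarithm is taken.
        [math.log(v) if v is not None and v > 0 else None for v in row]
        for row in x
    ]


# Hypothetical 2 x 3 data set: one zero and one null entry.
w_example = to_log_matrix([[1, 0, None], [math.e, 10, 100]])
print(w_example)
```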

A matrix of column differences, c_(ij), is defined by c_(ij)=w_(i,j+1)−w_(ij) whenever both w_(i,j+1) and w_(ij) are not null, and c_(ij)=null otherwise.

A matrix of row differences, r_(ij), is defined by r_(ij)=w_(i+1,j)−w_(ij) whenever both w_(i+1,j) and w_(ij) are not null, and r_(ij)=null otherwise.

If the j^(th) column of c_(ij) includes at least one non-null entry, c_(*j) is the average of the non-null entries in the j^(th) column of c_(ij); otherwise, c_(*j)=0.

If the i^(th) row of r_(ij) includes at least one non-null entry, r_(i*) is the average of the non-null entries in the i^(th) row of r_(ij); otherwise, r_(i*)=0.

C_(j+1)=C_(j)+c_(*j), where C₁=0; R_(i+1)=R_(i)+r_(i*), where R₁=0; and u_(ij)=R_(i)+C_(j).

K is the average of w_(ij)−u_(ij) over each (i,j) entry where w_(ij) is not null.

y_(ij)=w_(ij) whenever w_(ij) is not null and, otherwise, y_(ij)=K+u_(ij).

Output z_(ij)=Round(exp(y_(ij))), where each date and time period includes valid data. z_(ij) is the matrix that is sent on to the forecasting model or module. z_(ij) may be sent through a sequence of one or more modules to be analyzed. Results may then be sent to a module that updates parameters of the forecasting module.

Logarithms may be taken of particular values so that multiplicative effects between day-of-the-week and week-of-the-month may be conveniently expressed as additive effects. In some implementations, it may be more convenient for the algorithm to work with additive effects than directly with the multiplicative effects. For example, a multiplicative effect may be written m_effect=affect1*affect2 and an additive effect a_effect=affect3+affect4, where log(m_effect)=log(affect1*affect2)=log(affect1)+log(affect2). By taking logs, a multiplicative effect can be treated as an additive effect where log(m_effect)=a_effect, log(affect1)=affect3, log(affect2)=affect4.

A first example of how the above-recited functions of the algorithm of FIG. 3 behave for sparse valid data is as follows:

Where

$w_{ij} = \begin{bmatrix} \text{null} & -2 & 1 & 3 & \text{null} & 0 & -3 \\ 7 & 3 & \text{null} & 8 & 10 & 5 & \text{null} \\ \text{null} & \text{null} & -1 & \text{null} & \text{null} & -2 & \text{null} \\ \text{null} & \text{null} & 4 & 6 & \text{null} & \text{null} & \text{null} \\ \text{null} & \text{null} & 3 & \text{null} & \text{null} & \text{null} & \text{null} \end{bmatrix}$

and thus where

$y_{ij} = \begin{bmatrix} 2 & -2 & 1 & 3 & 5 & 0 & -3 \\ 7 & 3 & 6 & 8 & 10 & 5 & 2 \\ 0 & -4 & -1 & 1 & 3 & -2 & -5 \\ 5 & 1 & 4 & 6 & 8 & 3 & 0 \\ 4 & 0 & 3 & 5 & 7 & 2 & -1 \end{bmatrix}$

Here is another example, where

$w_{ij} = \begin{bmatrix} \text{null} & \text{null} & \text{null} & \text{null} & \text{null} & \text{null} & \text{null} \\ \text{null} & \text{null} & \text{null} & \text{null} & \text{null} & \text{null} & \text{null} \\ \text{null} & \text{null} & 1 & \text{null} & \text{null} & \text{null} & \text{null} \\ \text{null} & \text{null} & \text{null} & \text{null} & \text{null} & \text{null} & \text{null} \\ \text{null} & \text{null} & \text{null} & \text{null} & \text{null} & \text{null} & \text{null} \end{bmatrix}$

and thus where

$y_{ij} = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 & 1 & 1 \end{bmatrix}$
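The functions recited above for the algorithm of FIG. 3 may be sketched in Python as follows, starting from the logarithm matrix w_(ij). The function names and the use of `None` for null are assumptions made for this illustration; the data is the first example matrix given above.

```python
from typing import List, Optional

Matrix = List[List[Optional[float]]]


def _mean_or_zero(values: List[float]) -> float:
    """Average of the non-null differences, or 0 when there are none."""
    return sum(values) / len(values) if values else 0.0


def fill_dominant_pattern(w: Matrix) -> List[List[float]]:
    """Compute y_ij from w_ij using averaged row and column differences."""
    n, m = len(w), len(w[0])
    # c_*j: average of w[i][j+1] - w[i][j] over rows where both are non-null.
    col_step = [
        _mean_or_zero([w[i][j + 1] - w[i][j] for i in range(n)
                       if w[i][j] is not None and w[i][j + 1] is not None])
        for j in range(m - 1)
    ]
    # r_i*: average of w[i+1][j] - w[i][j] over columns where both are non-null.
    row_step = [
        _mean_or_zero([w[i + 1][j] - w[i][j] for j in range(m)
                       if w[i][j] is not None and w[i + 1][j] is not None])
        for i in range(n - 1)
    ]
    # Cumulative offsets C_j and R_i, with C_1 = R_1 = 0.
    C = [0.0]
    for step in col_step:
        C.append(C[-1] + step)
    R = [0.0]
    for step in row_step:
        R.append(R[-1] + step)
    # K: average of w_ij - u_ij over the non-null entries, u_ij = R_i + C_j.
    K = _mean_or_zero([w[i][j] - (R[i] + C[j])
                       for i in range(n) for j in range(m)
                       if w[i][j] is not None])
    # Keep valid entries; fill null entries with K + u_ij.
    return [[w[i][j] if w[i][j] is not None else K + R[i] + C[j]
             for j in range(m)] for i in range(n)]


N = None
W = [[N, -2, 1, 3, N, 0, -3],
     [7, 3, N, 8, 10, 5, N],
     [N, N, -1, N, N, -2, N],
     [N, N, 4, 6, N, N, N],
     [N, N, 3, N, N, N, N]]
for row in fill_dominant_pattern(W):
    print(row)
```

The output values z_(ij) would then be obtained by rounding exp(y_(ij)), a step omitted here because the example starts from the logarithms.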

In another embodiment, the method is similar to “Fill in Days” for monthly updates described above; however, day-of-the-week is replaced by time-period and week-of-the-month is replaced by comparable date. In a particular embodiment, n becomes the number of comparable dates, m becomes the number of time-periods within a day, i becomes an index for comparable dates and j becomes an index for time-period of a day. The calculations are completed using the above-described functions in the algorithm of FIG. 3.

FIG. 4 illustrates a method 400 of implementing another algorithm, according to an example embodiment of the present invention.

At block 410, transaction data may be received, as discussed herein.

At block 420, a gap in the transaction data may be determined, as discussed herein.

At block 430, a set of substitute data may be chosen from among a group of substitute data sets using a Moore-Penrose pseudo-inverse algorithm.

At block 440, the set of substitute data may be adopted into the determined gap.

In an embodiment, the Moore-Penrose pseudo-inverse algorithm may be more accurate as compared with the algorithm of FIG. 3 when the valid data is quite sparse (when the count of valid data is, for example, less than n+m) and the invalid data is plentiful. However, the Moore-Penrose pseudo-inverse algorithm may be associated with much more computation (in both space and time), and therefore may be less practical, especially when the data sets are large. For example, a set comprising several hundred comparable days where each day has one hundred periods may be considered large. A parameter may be set based on the data set size, for example, by the user or the administrator to determine which algorithm to use.

In an embodiment, the Moore-Penrose pseudo-inverse algorithm may fill in null or invalid data by producing an optimal “fill in”.

Let w_(ij) be the same as defined above with regard to the algorithm of FIG. 3, and let W denote the matrix of the w_(ij).

For p=1, 2, . . . , n+m and q=1, 2, . . . , n+m, let f_(pq) denote the elements of an n+m by n+m matrix, F, called the “filler”. The filler is a symmetric matrix, defined in the following way:

For p=1, 2, . . . , n and q=1, 2, . . . , n, let f_(pp)=the number of non-null entries in the p^(th) row of W and let f_(pq)=0 when p≠q. For p=n+1, n+2, . . . , n+m and q=n+1, n+2, . . . , n+m, let f_(pp)=the number of non-null entries in the (p−n)^(th) column of W and let f_(pq)=0 when p≠q. For p=1, 2, . . . , n and q=n+1, n+2, . . . , n+m, let f_(pq)=1 when w_(p,q−n) is not null and f_(pq)=0 when w_(p,q−n) is null. For p=n+1, n+2, . . . , n+m and q=1, 2, . . . , n, let f_(pq)=1 when w_(q,p−n) is not null and f_(pq)=0 when w_(q,p−n) is null.
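The construction of the filler may be sketched as follows. The function name `build_filler` and the use of `None` for null are assumptions made for this illustration; the example data is the 5 by 7 matrix W used in the examples of this description.

```python
from typing import List, Optional

Matrix = List[List[Optional[float]]]


def build_filler(w: Matrix) -> List[List[int]]:
    """Build the (n+m) by (n+m) filler matrix F from the n by m matrix W."""
    n, m = len(w), len(w[0])
    size = n + m
    F = [[0] * size for _ in range(size)]
    for i in range(n):
        for j in range(m):
            if w[i][j] is not None:
                F[i][i] += 1          # count of non-null entries in row i
                F[n + j][n + j] += 1  # count of non-null entries in column j
                F[i][n + j] = 1       # upper-right block: 1 where w_ij is non-null
                F[n + j][i] = 1       # lower-left block is the transpose
    return F


N = None
W = [[N, -2, 1, 3, N, 0, -3],
     [7, 3, N, 8, 10, 5, N],
     [N, N, -1, N, N, -2, N],
     [N, N, 4, 6, N, N, N],
     [N, N, 3, N, N, N, N]]
for row in build_filler(W):
    print(row)
```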

If A is some real matrix and B is a real matrix such that ABA=A, BAB=B, AB is symmetric, and BA is symmetric, then B is called a Moore-Penrose pseudoinverse of A. It is a theorem that every real matrix has a mathematically unique Moore-Penrose pseudoinverse. Let F⁺ denote the pseudoinverse of F. Let F⁺ be computed from F using, say, Greville's Theorem.
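The four defining conditions of a Moore-Penrose pseudoinverse (ABA=A, BAB=B, AB symmetric, BA symmetric) can be checked numerically for a candidate B. The sketch below uses plain Python lists; the helper names are assumptions made for this illustration. The example matrix is the filler of a 1 by 1 matrix W with a single non-null entry, F=[[1,1],[1,1]], whose pseudoinverse is [[0.25,0.25],[0.25,0.25]].

```python
from typing import List

Mat = List[List[float]]


def matmul(a: Mat, b: Mat) -> Mat:
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a: Mat) -> Mat:
    return [list(col) for col in zip(*a)]


def is_close(a: Mat, b: Mat, tol: float = 1e-9) -> bool:
    """True when a and b have the same shape and agree entrywise within tol."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        return False
    return all(abs(x - y) <= tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def is_pseudoinverse(a: Mat, b: Mat, tol: float = 1e-9) -> bool:
    """Check the four Penrose conditions for the pair (a, b)."""
    ab, ba = matmul(a, b), matmul(b, a)
    return (is_close(matmul(ab, a), a, tol)         # ABA = A
            and is_close(matmul(ba, b), b, tol)     # BAB = B
            and is_close(ab, transpose(ab), tol)    # AB symmetric
            and is_close(ba, transpose(ba), tol))   # BA symmetric


F_small = [[1.0, 1.0], [1.0, 1.0]]
F_small_plus = [[0.25, 0.25], [0.25, 0.25]]
print(is_pseudoinverse(F_small, F_small_plus))
```

Such a check does not compute a pseudoinverse; it only verifies one that has been obtained by some other means, for example by Greville's method.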

Let b denote the average of the non-null values of W.

For i=1, 2, . . . , n and j=1, 2, . . . , m, define $\tilde{w}_{ij}$ by the rule $\tilde{w}_{ij}$=w_(ij)−b when w_(ij) is not null and $\tilde{w}_{ij}$=null otherwise. Let $\tilde{W}$ denote the n by m matrix of the $\tilde{w}_{ij}$.

Define a real vector, g, with n+m components g_(k), for k=1, 2, . . . , n+m, by the following rules: For k=1, 2, . . . , n, let g_(k)=sum of the non-null elements in the k^(th) row of $\tilde{W}$ when at least one such element is not null and let g_(k)=0 when every element in the k^(th) row of $\tilde{W}$ is null.

For k=1+n, 2+n, . . . , m+n, let g_(k) equal the sum of the non-null elements in the (k−n)^(th) column of $\tilde{W}$ when at least one such element is not null and let g_(k)=0 when every element in the (k−n)^(th) column of $\tilde{W}$ is null.
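The average b and the vector g may be computed in a single pass over W, as in the following sketch. The function name `centered_sums` and the use of `None` for null are assumptions made for this illustration; the example data is the 5 by 7 matrix W used in the examples of this description.

```python
from typing import List, Optional, Tuple

Matrix = List[List[Optional[float]]]


def centered_sums(w: Matrix) -> Tuple[float, List[float]]:
    """Return (b, g): the mean of the non-null w_ij and the vector g."""
    n, m = len(w), len(w[0])
    values = [v for row in w for v in row if v is not None]
    b = sum(values) / len(values)  # assumes at least one non-null entry
    g = [0.0] * (n + m)
    for i in range(n):
        for j in range(m):
            if w[i][j] is not None:
                centered = w[i][j] - b
                g[i] += centered       # sum over the i-th row
                g[n + j] += centered   # sum over the j-th column
    return b, g


N = None
W = [[N, -2, 1, 3, N, 0, -3],
     [7, 3, N, 8, 10, 5, N],
     [N, N, -1, N, N, -2, N],
     [N, N, 4, 6, N, N, N],
     [N, N, 3, N, N, N, N]]
b, g = centered_sums(W)
print(b, [round(v, 6) for v in g])
```

Rows or columns with no valid entry simply keep the initial value 0, which matches the rule g_(k)=0 stated above.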

Define a real vector, h, with n+m components h_(k), for k=1, 2, . . . , n+m, by the following rule: h=F⁺g. The components of h are used to determine values to replace the null data in W as follows: For i=1, 2, . . . , n, let R_(i)=h_(i). For j=1, 2, . . . , m, let C_(j)=h_(j+n). Define u_(ij) by the rule u_(ij)=R_(i)+C_(j). Let y_(ij)=w_(ij) whenever w_(ij) is not null and otherwise, let y_(ij)=u_(ij)+b.

The real matrix of the y_(ij), Y, can be thought of as the matrix, W, with the null values filled in with data that is considered “valid”. As described above, W may be obtained by taking logarithms of the original data, x_(ij). Now let z_(ij)=x_(ij) wherever x_(ij) has valid data and let z_(ij)=Round(exp(y_(ij))) otherwise.

Output the z_(ij).

In an example embodiment, the algorithm of FIG. 4 may be executed as follows, using the same first matrix, W, used in the example of the algorithm of FIG. 3, where W=

$\begin{bmatrix} \text{null} & -2 & 1 & 3 & \text{null} & 0 & -3 \\ 7 & 3 & \text{null} & 8 & 10 & 5 & \text{null} \\ \text{null} & \text{null} & -1 & \text{null} & \text{null} & -2 & \text{null} \\ \text{null} & \text{null} & 4 & 6 & \text{null} & \text{null} & \text{null} \\ \text{null} & \text{null} & 3 & \text{null} & \text{null} & \text{null} & \text{null} \end{bmatrix}$

the corresponding filler, F, =

$\begin{bmatrix} 5 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 0 & 1 & 1 \\ 0 & 5 & 0 & 0 & 0 & 1 & 1 & 0 & 1 & 1 & 1 & 0 \\ 0 & 0 & 2 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 2 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 & 0 & 2 & 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & 1 & 1 & 0 & 0 & 4 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 3 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 3 & 0 \\ 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix}$

Thus, F⁺ is approximately:

-   [2.31082375478927E-0001, -5.34003831417627E-0002, -3.04118773946362E-0002, -3.04118773946361E-0002, -8.21360153256704E-0002, 1.36733716475096E-0001, -4.71743295019157E-0002, -1.19731800766277E-0003, -2.13122605363984E-0002, 1.36733716475096E-0001, -2.13122605363983E-0002, -1.47749042145594E-0001]
-   [-5.34003831417628E-0002, 2.82806513409962E-0001, -1.07998084291187E-0001, -1.07998084291187E-0001, -2.28687739463601E-0001, -1.99473180076628E-0001, -7.30363984674330E-0002, 1.45354406130268E-0001, -1.26915708812260E-0002, -1.99473180076629E-0001, -1.26915708812261E-0002, 1.36733716475096E-0001]
-   [-3.04118773946362E-0002, -1.07998084291187E-0001, 5.77059386973180E-0001, -2.29406130268199E-0002, 3.56800766283525E-0002, 1.91331417624521E-0001, 1.10871647509578E-0001, -1.19013409961685E-0001, 8.15613026819923E-0002, 1.91331417624521E-0001, -1.18438697318007E-0001, 1.13745210727969E-0001]
-   [-3.04118773946362E-0002, -1.07998084291187E-0001, -2.29406130268199E-0002, 5.77059386973180E-0001, 3.56800766283525E-0002, 1.91331417624521E-0001, 1.10871647509578E-0001, -1.19013409961685E-0001, -1.18438697318007E-0001, 1.91331417624521E-0001, 8.15613026819924E-0002, 1.13745210727969E-0001]
-   [-8.21360153256706E-0002, -2.28687739463602E-0001, 3.56800766283525E-0002, 3.56800766283527E-0002, 1.19085249042146E+0000, 3.12021072796935E-0001, 1.97078544061303E-0001, -2.74185823754789E-0001, 1.19492337164750E-0001, 3.12021072796935E-0001, 1.19492337164751E-0001, 1.65469348659004E-0001]
-   [1.36733716475096E-0001, -1.99473180076629E-0001, 1.91331417624521E-0001, 1.91331417624521E-0001, 3.12021072796935E-0001, 1.11613984674330E+0000, -1.02969348659003E-0002, -2.28687739463601E-0001, -7.06417624521073E-0002, 1.16139846743295E-0001, -7.06417624521072E-0002, -2.20067049808430E-0001]
-   [-4.71743295019156E-0002, -7.30363984674331E-0002, 1.10871647509578E-0001, 1.10871647509578E-0001, 1.97078544061303E-0001, -1.02969348659003E-0002, 5.18438697318008E-0001, -1.13745210727969E-0001, -2.46647509578544E-0002, -1.02969348659001E-0002, -2.46647509578544E-0002, -3.61590038314179E-0002]
-   [-1.19731800766279E-0003, 1.45354406130268E-0001, -1.19013409961685E-0001, -1.19013409961686E-0001, -2.74185823754789E-0001, -2.28687739463602E-0001, -1.13745210727969E-0001, 3.57519157088123E-0001, -3.61590038314176E-0002, -2.28687739463601E-0001, -3.61590038314176E-0002, -8.21360153256707E-0002]
-   [-2.13122605363983E-0002, -1.26915708812260E-0002, 8.15613026819924E-0002, -1.18438697318007E-0001, 1.19492337164750E-0001, -7.06417624521073E-0002, -2.46647509578544E-0002, -3.61590038314177E-0002, 3.56369731800766E-0001, -7.06417624521072E-0002, -4.36302681992338E-0002, -6.20210727969350E-0002]
-   [1.36733716475096E-0001, -1.99473180076629E-0001, 1.91331417624521E-0001, 1.91331417624521E-0001, 3.12021072796935E-0001, 1.16139846743295E-0001, -1.02969348659002E-0002, -2.28687739463602E-0001, -7.06417624521074E-0002, 1.11613984674330E+0000, -7.06417624521072E-0002, -2.20067049808430E-0001]
-   [-2.13122605363983E-0002, -1.26915708812263E-0002, -1.18438697318007E-0001, 8.15613026819924E-0002, 1.19492337164750E-0001, -7.06417624521073E-0002, -2.46647509578544E-0002, -3.61590038314175E-0002, -4.36302681992337E-0002, -7.06417624521071E-0002, 3.56369731800766E-0001, -6.20210727969353E-0002]
-   [-1.47749042145596E-0001, 1.36733716475097E-0001, 1.13745210727970E-0001, 1.13745210727969E-0001, 1.65469348659004E-0001, -2.20067049808429E-0001, -3.61590038314176E-0002, -8.21360153256708E-0002, -6.20210727969352E-0002, -2.20067049808430E-0001, -6.20210727969354E-0002, 1.06441570881226E+0000]

For the first matrix, W, b=2.8, and the elements of $\tilde{W}$ include:

$\begin{bmatrix} \text{null} & -4.8 & -1.8 & 0.2 & \text{null} & -2.8 & -5.8 \\ 4.2 & 0.2 & \text{null} & 5.2 & 7.2 & 2.2 & \text{null} \\ \text{null} & \text{null} & -3.8 & \text{null} & \text{null} & -4.8 & \text{null} \\ \text{null} & \text{null} & 1.2 & 3.2 & \text{null} & \text{null} & \text{null} \\ \text{null} & \text{null} & 0.2 & \text{null} & \text{null} & \text{null} & \text{null} \end{bmatrix}$

g is given by

$g = \begin{pmatrix} -15 \\ 19 \\ -8.6 \\ 4.4 \\ 0.2 \\ 4.2 \\ -4.6 \\ -4.2 \\ 8.6 \\ 7.2 \\ -5.4 \\ -5.8 \end{pmatrix}$

Finding F⁺ by Greville's Theorem, computing h=F⁺g, and solving for the y_(ij) in terms of the components of h recovers a matrix that is identical to the y_(ij) matrix generated by the algorithm of FIG. 3. However, generating the y_(ij) matrix by the algorithm of FIG. 4 may be computationally more expensive.
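The fill of FIG. 4 may be sketched end to end as follows. Note the substitution: instead of forming F⁺ by Greville's method, this sketch obtains h with a conjugate gradient iteration started from h=0, which for a symmetric positive semi-definite F and a consistent system Fh=g converges to the same minimum-norm vector F⁺g. The function names, tolerances, and the use of `None` for null are assumptions made for this illustration.

```python
from typing import List, Optional

Matrix = List[List[Optional[float]]]


def min_norm_solve(F: List[List[float]], g: List[float],
                   tol: float = 1e-12, max_iter: int = 200) -> List[float]:
    """Conjugate gradient from h = 0 on the symmetric system F h = g."""
    size = len(g)
    h = [0.0] * size
    r = list(g)                      # residual g - F h for h = 0
    p = list(r)                      # first search direction
    rs = sum(v * v for v in r)
    for _ in range(max_iter):
        if rs <= tol * tol:
            break
        Fp = [sum(F[a][c] * p[c] for c in range(size)) for a in range(size)]
        pFp = sum(p[a] * Fp[a] for a in range(size))
        if pFp <= 0.0:               # direction lies in the null space of F
            break
        alpha = rs / pFp
        h = [h[a] + alpha * p[a] for a in range(size)]
        r = [r[a] - alpha * Fp[a] for a in range(size)]
        rs_new = sum(v * v for v in r)
        p = [r[a] + (rs_new / rs) * p[a] for a in range(size)]
        rs = rs_new
    return h


def fill_least_squares(w: Matrix) -> List[List[float]]:
    """Fill null entries of W with R_i + C_j + b, where h = (R, C)."""
    n, m = len(w), len(w[0])
    observed = [(i, j) for i in range(n) for j in range(m)
                if w[i][j] is not None]
    b = sum(w[i][j] for i, j in observed) / len(observed)
    F = [[0.0] * (n + m) for _ in range(n + m)]
    g = [0.0] * (n + m)
    for i, j in observed:
        F[i][i] += 1.0               # row count on the diagonal
        F[n + j][n + j] += 1.0       # column count on the diagonal
        F[i][n + j] = F[n + j][i] = 1.0
        g[i] += w[i][j] - b          # centered row sum
        g[n + j] += w[i][j] - b      # centered column sum
    h = min_norm_solve(F, g)
    return [[w[i][j] if w[i][j] is not None else h[i] + h[n + j] + b
             for j in range(m)] for i in range(n)]


N = None
W = [[N, -2, 1, 3, N, 0, -3],
     [7, 3, N, 8, 10, 5, N],
     [N, N, -1, N, N, -2, N],
     [N, N, 4, 6, N, N, N],
     [N, N, 3, N, N, N, N]]
for row in fill_least_squares(W):
    print([round(v, 6) for v in row])
```

For the example matrix W, the filled values agree, up to floating-point error, with the y_(ij) matrix obtained from the algorithm of FIG. 3.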

The component of the algorithm described here acts upon the logarithms of the raw data, in the instance where that raw data is not null and not zero. The logarithms may be placed in a (not real) n by m matrix, W, whose elements are either real numbers or null, where at least one entry is not null.

In an embodiment of the algorithm of FIG. 3, let w_(ij) denote the entries of the logarithm matrix, W. Each w_(ij) is either a real number or null. For any set, A, let o(A) denote the cardinality of A.

The set, S, may be defined by the rule S={(i,j) | w_(ij)≠null}.

μ may be defined by the rule

$\mu = {\frac{1}{o(S)}{\sum\limits_{{({i,j})} \in S}{w_{ij}.}}}$

y_(ij) may be defined by the rule

$y_{ij} = \left\{ {\begin{matrix}{{w_{ij} - \mu};} & {\left( {i,j} \right) \in S} \\{{null};} & {\left( {i,j} \right) \notin S}\end{matrix}.} \right.$

Y may be defined to be the matrix of y_(ij).

V may be defined to be a real-valued function of n+m real variables so that V=V(r₁, . . . , r_(n), c₁, . . . , c_(m)) where

${V\left( {r_{1},\ldots\mspace{11mu},r_{n},c_{1},\ldots\mspace{11mu},c_{m}} \right)} = {\sum\limits_{{({i,j})} \in S}{\left( {y_{ij} - r_{i} - c_{j}} \right)^{2}.}}$
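The function V may be evaluated directly from its definition, as in the sketch below. The function name `objective_v` and the use of `None` for null are assumptions made for this illustration; the small 2 by 2 matrix is hypothetical data chosen so that an exact fit exists.

```python
from typing import List, Optional


def objective_v(y: List[List[Optional[float]]],
                r: List[float], c: List[float]) -> float:
    """Sum of squared residuals (y_ij - r_i - c_j)^2 over non-null y_ij."""
    return sum((y[i][j] - r[i] - c[j]) ** 2
               for i in range(len(y)) for j in range(len(y[0]))
               if y[i][j] is not None)


# Hypothetical centered data with one null entry.
Y = [[1.0, None],
     [2.0, 4.0]]
print(objective_v(Y, r=[0.0, 1.0], c=[1.0, 3.0]))   # exact fit
print(objective_v(Y, r=[0.0, 0.0], c=[0.0, 0.0]))   # no row or column effects
```

Adding a constant to every r_(i) and subtracting it from every c_(j) leaves each residual, and therefore V, unchanged, which is one way to see why V may have many minimizers.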

V is a non-negative quadratic function, so V may have a global minimum value, but there may be many values of (r₁, . . . , r_(n), c₁, . . . , c_(m)) that achieve this minimum value of V. To find a minimum of V, points where V is stationary are sought. That is, where

$\frac{\partial V}{\partial r_k} = 0; \quad k = 1, \ldots, n$

$\frac{\partial V}{\partial c_l} = 0; \quad l = 1, \ldots, m,$

but

$\frac{\partial V}{\partial r_k} = -\sum_{(i,j) \in S} 2\left(y_{ij} - r_i - c_j\right)\delta_{ik} = -\sum_{(k,j) \in S} 2\left(y_{kj} - r_k - c_j\right),$ for k = 1, …, n

and

$\frac{\partial V}{\partial c_l} = -\sum_{(i,j) \in S} 2\left(y_{ij} - r_i - c_j\right)\delta_{jl} = -\sum_{(i,l) \in S} 2\left(y_{il} - r_i - c_l\right),$ for l = 1, …, m.

Therefore a minimum satisfies:

$\sum_{(k,j) \in S} 2\left(y_{kj} - r_k - c_j\right) = 0; \quad k = 1, \ldots, n$

$\sum_{(i,l) \in S} 2\left(y_{il} - r_i - c_l\right) = 0; \quad l = 1, \ldots, m$

The first n sums may be over the “non-null” elements in the k^(th) row of Y. The second m sums may be over the “non-null” elements in the l^(th) column of Y.

Let P_(k)={j|(i,j)∈S and i=k} and let Q_(l)={i|(i,j)∈S and j=l}. The system of equations may be written as

$\sum_{j \in P_k} \left(y_{kj} - r_k - c_j\right) = 0; \quad k = 1, \ldots, n$

$\sum_{i \in Q_l} \left(y_{il} - r_i - c_l\right) = 0; \quad l = 1, \ldots, m$

or

$\sum_{j \in P_k} r_k + \sum_{j \in P_k} c_j = \sum_{j \in P_k} y_{kj}; \quad k = 1, \ldots, n$

$\sum_{i \in Q_l} r_i + \sum_{i \in Q_l} c_l = \sum_{i \in Q_l} y_{il}; \quad l = 1, \ldots, m$

or

$o(P_k)\, r_k + \sum_{j \in P_k} c_j = \sum_{j \in P_k} y_{kj}; \quad k = 1, \ldots, n$

$\sum_{i \in Q_l} r_i + o(Q_l)\, c_l = \sum_{i \in Q_l} y_{il}; \quad l = 1, \ldots, m.$

Note that o(P_(k)) is the number of non-null elements in the k^(th) row of Y and that o(Q_(l)) is the number of non-null elements in the l^(th) column of Y. Also note that

$\sum_{j \in P_k} y_{kj}$

is the sum of the non-null elements in the k^(th) row of Y and

$\sum_{i \in Q_l} y_{il}$

is the sum of the non-null elements in the l^(th) column of Y.

The system of equations shown above comprises n+m simultaneous linear equations in n+m variables. As such, the system of equations may be expressed as a vector-matrix equation in R^(n+m) of the form Fh=g, where F is an n+m by n+m real matrix and both g and h are vectors in R^(n+m).

The vectors h and g are:

$h = \begin{pmatrix} r_1 \\ \vdots \\ r_n \\ c_1 \\ \vdots \\ c_m \end{pmatrix}, \quad g = \begin{pmatrix} \sum_{j \in P_1} y_{1j} \\ \vdots \\ \sum_{j \in P_n} y_{nj} \\ \sum_{i \in Q_1} y_{i1} \\ \vdots \\ \sum_{i \in Q_m} y_{im} \end{pmatrix}.$

In order to describe F, the symbol, ε_(ij), may be used, where ε_(ij)=1 when y_(ij) is not null and ε_(ij)=0 when y_(ij) is null.

$F = \begin{pmatrix} o(P_1) & 0 & \cdots & 0 & \varepsilon_{11} & \varepsilon_{12} & \cdots & \varepsilon_{1m} \\ 0 & o(P_2) & \cdots & 0 & \varepsilon_{21} & \varepsilon_{22} & \cdots & \varepsilon_{2m} \\ \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & o(P_n) & \varepsilon_{n1} & \varepsilon_{n2} & \cdots & \varepsilon_{nm} \\ \varepsilon_{11} & \varepsilon_{21} & \cdots & \varepsilon_{n1} & o(Q_1) & 0 & \cdots & 0 \\ \varepsilon_{12} & \varepsilon_{22} & \cdots & \varepsilon_{n2} & 0 & o(Q_2) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots \\ \varepsilon_{1m} & \varepsilon_{2m} & \cdots & \varepsilon_{nm} & 0 & 0 & \cdots & o(Q_m) \end{pmatrix}.$

The matrix F is a symmetric matrix. The elements on the diagonal of the matrix F may be expressed in terms of the ε_(ij) term, as follows:

$F = \begin{pmatrix} \sum_j \varepsilon_{1j} & 0 & \cdots & 0 & \varepsilon_{11} & \varepsilon_{12} & \cdots & \varepsilon_{1m} \\ 0 & \sum_j \varepsilon_{2j} & \cdots & 0 & \varepsilon_{21} & \varepsilon_{22} & \cdots & \varepsilon_{2m} \\ \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sum_j \varepsilon_{nj} & \varepsilon_{n1} & \varepsilon_{n2} & \cdots & \varepsilon_{nm} \\ \varepsilon_{11} & \varepsilon_{21} & \cdots & \varepsilon_{n1} & \sum_i \varepsilon_{i1} & 0 & \cdots & 0 \\ \varepsilon_{12} & \varepsilon_{22} & \cdots & \varepsilon_{n2} & 0 & \sum_i \varepsilon_{i2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots \\ \varepsilon_{1m} & \varepsilon_{2m} & \cdots & \varepsilon_{nm} & 0 & 0 & \cdots & \sum_i \varepsilon_{im} \end{pmatrix}.$

The equation Fh=g includes at least one solution, and possibly an infinite number of solutions. An infinite number of values may minimize V=V(r₁, . . . , r_(n), c₁, . . . , c_(m)). The solution chosen to use for the fill in may be the solution that leads to a most conservative approximation of the y_(ij) by the values of r_(i)+c_(j). Such a solution, h, is one for which ∥h∥ is minimum. In other words, find an h, such that Fh=g and ∥h∥ is minimum. Such an h may be found by means of the pseudoinverse of F. The pseudoinverse of F is a mathematically unique matrix, denoted F⁺. The solution for h, such that ∥h∥ is minimum, may be given by h=F⁺g.
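One family of solutions can be written out explicitly, as a worked illustration of why Fh=g may have infinitely many solutions. The shift parameter t is introduced here for the illustration only. The last line gives the minimum-norm member of this one-parameter family; it coincides with F⁺g only under the assumption that this family is the entire solution set, which holds when every row and column of W is linked to every other through non-null entries.

```latex
% Shifting all row terms by t and all column terms by -t leaves every
% sum r_i + c_j, and therefore V, unchanged:
r_i' = r_i + t, \qquad c_j' = c_j - t
\quad\Longrightarrow\quad
r_i' + c_j' = r_i + c_j, \qquad V(r', c') = V(r, c).

% Squared norm of the shifted solution h(t):
\|h(t)\|^2 = \sum_{i=1}^{n} (r_i + t)^2 + \sum_{j=1}^{m} (c_j - t)^2 .

% Setting the derivative with respect to t to zero:
\frac{d}{dt}\|h(t)\|^2 = 2\sum_{i=1}^{n} (r_i + t) - 2\sum_{j=1}^{m} (c_j - t) = 0
\quad\Longleftrightarrow\quad
t = \frac{\sum_{j=1}^{m} c_j - \sum_{i=1}^{n} r_i}{n + m}.
```

At this value of t the shifted row terms and the shifted column terms have equal sums, so the minimum-norm choice balances the two groups of unknowns.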

This result follows from the definition of pseudoinverse, where:

-   FF⁺F=F, F⁺FF⁺=F⁺, FF⁺=(FF⁺)^(T), and F⁺F=(F⁺F)^(T).

The above-recited relations imply that (F⁺F)(F⁺F)=F⁺F and(FF⁺)(FF⁺)=FF⁺, so that, in virtue of their symmetries, F⁺F and FF⁺ areboth projections. For any x in R^(n+m), either of these projectionsdetermines a decomposition of x into orthogonal components:x=(I−F ⁺ F)x+(F ⁺ F)x or x=(l−FF ⁺)x+(FF ⁺)x,so that(x,x)=((I−F ⁺ F)x,(I−F ⁺ F)x)+((F ⁺ F)x,(F ⁺ F)x)or (x,x)=((I−FF ⁺)x,(I−FF ⁺)x,(I−FF ⁺)x)+((FF ⁺ x),(FF ⁺ x)),respectively.

Consequently, (F⁺Fx,F⁺Fx)≦(x,x) and (FF⁺x,FF⁺x)≦(x,x) for any x in R^(n+m). Also, if (F⁺Fx,F⁺Fx)=(x,x) or (FF⁺x,FF⁺x)=(x,x), respectively, then ((I−F⁺F)x,(I−F⁺F)x)=0 or ((I−FF⁺)x,(I−FF⁺)x)=0, respectively, so that (I−F⁺F)x=0 or (I−FF⁺)x=0, respectively. This forces F⁺Fx=x or FF⁺x=x, respectively. Therefore, if (F⁺Fx,F⁺Fx)=(x,x) then F⁺Fx=x and if (FF⁺x,FF⁺x)=(x,x) then FF⁺x=x.
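The four defining relations and the projection properties used in this argument can be checked numerically. The sketch below does so for a hypothetical F (2 weeks, 2 days, all entries known); the function names, the test vector x, and the tolerances are assumptions chosen only for illustration.

```python
import numpy as np

def satisfies_penrose(F, Fp, tol=1e-9):
    """Return True if Fp meets the four defining relations for F."""
    return bool(np.allclose(F @ Fp @ F, F, atol=tol)
                and np.allclose(Fp @ F @ Fp, Fp, atol=tol)
                and np.allclose(F @ Fp, (F @ Fp).T, atol=tol)
                and np.allclose(Fp @ F, (Fp @ F).T, atol=tol))

def is_orthogonal_projection(P, tol=1e-9):
    """Return True if P is idempotent and symmetric."""
    return bool(np.allclose(P @ P, P, atol=tol)
                and np.allclose(P, P.T, atol=tol))

# Hypothetical F for n = 2, m = 2 with every entry known (all indicators 1).
F = np.array([[2.0, 0.0, 1.0, 1.0],
              [0.0, 2.0, 1.0, 1.0],
              [1.0, 1.0, 2.0, 0.0],
              [1.0, 1.0, 0.0, 2.0]])
Fp = np.linalg.pinv(F, rcond=1e-10)  # rcond is an assumed cutoff
P = Fp @ F

# Decompose an arbitrary x into (I - F+F)x and (F+F)x.
x = np.array([1.0, -2.0, 0.5, 3.0])
px = P @ x
residual = x - px

print("Penrose relations hold:", satisfies_penrose(F, Fp))
print("F+F is a projection   :", is_orthogonal_projection(P))
print("components orthogonal :", abs(px @ residual) < 1e-9)
print("(F+Fx, F+Fx) <= (x, x):", px @ px <= x @ x)
```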

h̃ may be defined by the rule h̃=F⁺g. Then Fh̃=FF⁺g and F⁺Fh̃=F⁺FF⁺g=F⁺g=h̃, so that F⁺Fh̃=h̃.

Suppose there is an h such that Fh=g. Then F⁺Fh=F⁺g=h̃, so that (h̃,h̃)=(F⁺Fh,F⁺Fh)≦(h,h) for any solution, h.

Moreover, Fh̃=FF⁺g=FF⁺Fh=Fh=g; therefore, h̃ is a solution to Fh=g for which ∥h∥ is minimum. Furthermore, suppose (h,h)=(h̃,h̃). Then (h,h)=(F⁺g,F⁺g)=(F⁺Fh,F⁺Fh) and, because F⁺F is a projection, F⁺Fh=h by implication. Again, because h is a solution, F⁺g=F⁺Fh=h; but F⁺g=h̃, so h̃=h. Therefore, if (h,h)=(h̃,h̃) then h̃=h. Therefore, h̃=F⁺g is a mathematically unique solution to Fh=g for which ∥h∥ is minimum.

The components of h̃ give the values of r_(i) and c_(j) used to fill in the null values of W as follows: If (i,j)∉S, then w_(ij)=r_(i)+c_(j)+μ. Otherwise, the value of w_(ij) remains unchanged.
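The fill-in rule above may be sketched end to end as follows. This is a minimal illustration under stated assumptions: nulls are represented as NaN, P_(i) and Q_(j) are taken to be the non-null positions in row i and column j, and the function name `fill_by_pseudoinverse`, the sample data, and the `rcond` cutoff are hypothetical.

```python
import numpy as np

def fill_by_pseudoinverse(w):
    """Fill NaN entries of the logarithm matrix w with r_i + c_j + mu."""
    w = np.asarray(w, dtype=float)
    n, m = w.shape
    known = ~np.isnan(w)               # the set S of non-null positions
    eps = known.astype(float)          # indicator: 1 where y_ij is not null
    mu = w[known].mean()               # mean over S
    y = np.where(known, w - mu, 0.0)   # nulls contribute nothing to the sums

    # F has the row and column counts on its diagonal and eps off the diagonal.
    F = np.block([[np.diag(eps.sum(axis=1)), eps],
                  [eps.T, np.diag(eps.sum(axis=0))]])
    g = np.concatenate([y.sum(axis=1), y.sum(axis=0)])

    # Minimum-norm solution; rcond is an assumed singular-value cutoff.
    h = np.linalg.pinv(F, rcond=1e-10) @ g
    r, c = h[:n], h[n:]

    # Known entries are left unchanged; nulls receive r_i + c_j + mu.
    return np.where(known, w, r[:, None] + c[None, :] + mu)

# Hypothetical log-scale data for 3 weeks by 4 days with one null entry.
alpha = np.array([1.0, 1.5, 2.0])        # week effects (illustrative)
beta = np.array([3.0, 3.2, 2.9, 3.5])    # day-of-week effects (illustrative)
w_true = alpha[:, None] + beta[None, :]
w = w_true.copy()
w[1, 2] = np.nan

filled = fill_by_pseudoinverse(w)
print("filled value:", filled[1, 2], "reference:", w_true[1, 2])
```

The sample data here is exactly additive in week and day effects, so the least-squares fit reproduces the withheld entry; real transaction data would only be approximated.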

The automated update algorithms described herein may make consistent judgments about enormous quantities of numerical data, and may reduce the risk that clerical errors associated with manual update activities may deform the forecast model. Automated introduction of the new data may avoid inappropriate changes in the day of week patterns that are extracted from the data, which may reduce deformation of the forecast model.
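For comparison, the first algorithm as recited in the claims (log transform, row and column differences, cumulative sums, and rounding of the exponential) can be sketched as shown below. This is an illustration only: nulls are represented as NaN, valid data are assumed non-negative, `Round` is assumed to be NumPy's `rint` (which rounds halves to the nearest even integer), and the function names and sample data are hypothetical.

```python
import numpy as np

def _mean_or_zero(a, axis):
    """Average the non-NaN entries along axis; return 0 where there are none."""
    valid = ~np.isnan(a)
    count = valid.sum(axis=axis)
    total = np.where(valid, a, 0.0).sum(axis=axis)
    return np.where(count > 0, total / np.maximum(count, 1), 0.0)

def fill_by_differences(x):
    """Fill null (NaN) or zero entries of an n-weeks by m-days matrix x."""
    x = np.asarray(x, dtype=float)
    v = np.where(x == 0, np.nan, x)       # zero counts are treated as null
    w = np.log(v)                         # NaN stays NaN

    c = w[:, 1:] - w[:, :-1]              # column differences, NaN if either is null
    r = w[1:, :] - w[:-1, :]              # row differences, NaN if either is null
    c_star = _mean_or_zero(c, axis=0)     # one average per column of c
    r_star = _mean_or_zero(r, axis=1)     # one average per row of r

    C = np.concatenate(([0.0], np.cumsum(c_star)))   # C_1 = 0, C_{j+1} = C_j + c_*j
    R = np.concatenate(([0.0], np.cumsum(r_star)))   # R_1 = 0, R_{i+1} = R_i + r_i*
    u = R[:, None] + C[None, :]                      # u_ij = R_i + C_j

    known = ~np.isnan(w)
    K = (w - u)[known].mean()             # average offset over non-null entries
    y = np.where(known, w, K + u)
    return np.rint(np.exp(y)).astype(int)

# Hypothetical counts: 3 weeks by 4 days, one missing value and one zero.
x_true = np.outer([1, 2, 4], [10, 20, 40, 80]).astype(float)
x = x_true.copy()
x[1, 2] = np.nan
x[0, 0] = 0.0

z = fill_by_differences(x)
print(z)
```

The sample is exactly multiplicative in week and day factors, so the two gaps are restored to the withheld values of 80 and 10; with noisy data the fill would follow the averaged pattern instead.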

Computer Architecture

FIG. 5 shows a diagrammatic representation of a machine in the example form of a computer system 600 within which a set of instructions, for causing the machine to perform any one or more of the methodologies discussed herein, may be executed. In alternative embodiments, the machine operates as a standalone device or may be connected (e.g., networked) to other machines. In a networked deployment, the machine may operate in the capacity of a server or a client machine in a server-client network environment, or as a peer machine in a peer-to-peer (or distributed) network environment. The machine may be a personal computer (PC), a tablet PC, a set-top box (STB), a Personal Digital Assistant (PDA), a cellular telephone, a web appliance, a network router, switch or bridge, or any machine capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken by that machine. Further, while only a single machine is illustrated, the term “machine” shall also be taken to include any collection of machines that individually or jointly execute a set (or multiple sets) of instructions to perform any one or more of the methodologies discussed herein.

The example computer system 600 includes a processor 602 (e.g., a central processing unit (CPU), a graphics processing unit (GPU) or both), a main memory 604 and a static memory 606, which communicate with each other via a bus 608. The computer system 600 may further include a video display unit 610 (e.g., a liquid crystal display (LCD) or a cathode ray tube (CRT)). The computer system 600 also includes an alphanumeric input device 612 (e.g., a keyboard), a user interface (UI) navigation device 614 (e.g., a mouse), a disk drive unit 616, a signal generation device 618 (e.g., a speaker) and a network interface device 620.

The disk drive unit 616 includes a machine-readable medium 622 on which are stored one or more sets of instructions and data structures (e.g., software 624) embodying or utilized by any one or more of the methodologies or functions described herein. The software 624 may also reside, completely or at least partially, within the main memory 604 and/or within the processor 602 during execution thereof by the computer system 600, the main memory 604 and the processor 602 also constituting machine-readable media.

The software 624 may further be transmitted or received over a network 626 via the network interface device 620 utilizing any one of a number of well-known transfer protocols (e.g., HTTP).

While the machine-readable medium 622 is shown in an example embodiment to be a single medium, the term “machine-readable medium” should be taken to include a single medium or multiple media (e.g., a centralized or distributed database, and/or associated caches and servers) that store the one or more sets of instructions. The term “machine-readable medium” shall also be taken to include any medium that is capable of storing, encoding or carrying a set of instructions for execution by the machine and that cause the machine to perform any one or more of the methodologies of the present invention, or that is capable of storing, encoding or carrying data structures utilized by or associated with such a set of instructions. The term “machine-readable medium” shall accordingly be taken to include, but not be limited to, solid-state memories, optical and magnetic media, and carrier wave signals.

Although an embodiment of the present invention has been described with reference to specific example embodiments, it will be evident that various modifications and changes may be made to these embodiments without departing from the broader spirit and scope of the invention. Accordingly, the specification and drawings are to be regarded in an illustrative rather than a restrictive sense.

1. A method comprising: receiving incomplete transaction data at an interface to a processing circuit; determining by the processing circuit a gap in the incomplete transaction data; and using an algorithm implemented by the processing circuit to generate data to fill in the gap and to generate complete transaction data, wherein the algorithm is selected by the processing circuit from a group including at least a first algorithm and a second algorithm, wherein the first algorithm is automatically to: determine a dominant pattern in the transaction data; identify a region within the dominant pattern that corresponds to the gap in the transaction data; and adopt data associated with the corresponding region in the gap to minimize impact on the dominant pattern; wherein the second algorithm includes a Moore-Penrose pseudo-inverse algorithm to choose at least a portion of the transaction data to fill in the gap based on a set of substitute data from among a group of substitute data sets and to adopt the set of substitute data into the gap; and where the first algorithm includes (i, j) referring to a j^(th) day of an i^(th) week, for n weeks with m days in each week, wherein x_(ij) includes valid numerical data, and if the data is not valid on (i, j), x_(ij)=null, wherein v_(ij) includes v_(ij)=x_(ij), unless x_(ij)=0, in which case, v_(ij)=null, wherein w_(ij) includes w_(ij)=ln(v_(ij)) whenever v_(ij) is not null, and w_(ij)=null whenever v_(ij)=null, wherein a matrix of column differences, c_(ij), includes c_(ij)=w_(ij+1)−w_(ij) whenever both w_(ij+1) and w_(ij) are not null, and c_(ij)=null, otherwise, wherein a matrix of row differences, r_(ij), includes r_(ij)=w_(i+1j)−w_(ij) whenever both w_(i+1j) and w_(ij) are not null, and r_(ij)=null, otherwise, wherein a j^(th) column of c_(ij) includes at least one non-null entry, and c_(*j) includes an average of each non-null entry in the j^(th) column of c_(ij), otherwise, c_(*j)=0, wherein an i^(th) row of r_(ij) includes at least one non-null entry, and r_(i*) includes an average of each non-null entry in the i^(th) row of r_(ij), otherwise, r_(i*)=0, wherein C_(j+1)=C_(j)+c_(*j), where C₁=0, wherein R_(i+1)=R_(i)+r_(i*), where R₁=0, wherein u_(ij)=R_(i)+C_(j), wherein K includes an average of w_(ij)−u_(ij) over each (i, j) entry where w_(ij) is not null, wherein y_(ij)=w_(ij) whenever w_(ij) is not null and otherwise, y_(ij)=K+u_(ij), wherein output z_(ij)=Round(exp(y_(ij))), wherein the output z_(ij) corresponds to filling in the gap.
2. The method of claim 1 wherein the algorithm is selected based upon at least one of amount of the transaction data, forecasting module restrictions, and a ratio of valid data to gap data.
3. The method of claim 1 wherein the algorithm is selected based upon one of maximizing accuracy for filling in the gaps, and minimizing processing time for filling in the gap.
4. The method of claim 1 wherein the second algorithm includes an equation Fh=g, wherein Fh=g includes a plurality of solutions, for h, wherein a solution from the plurality of solutions that is selected to fill in the gap is the solution for h, such that ∥h∥ is minimized solving for h=F⁺g, wherein a pseudoinverse of F includes F⁺, wherein vectors h and g include: $h = \begin{pmatrix} r_{1} \\ \vdots \\ r_{n} \\ c_{1} \\ \vdots \\ c_{m} \end{pmatrix}, \quad g = \begin{pmatrix} \sum\limits_{j \in P_{1}} y_{1j} \\ \vdots \\ \sum\limits_{j \in P_{n}} y_{nj} \\ \sum\limits_{i \in Q_{1}} y_{i1} \\ \vdots \\ \sum\limits_{i \in Q_{m}} y_{im} \end{pmatrix},$ respectively, wherein $F = \begin{pmatrix} \sum\limits_{j}\varepsilon_{1j} & 0 & \cdots & 0 & \varepsilon_{11} & \varepsilon_{12} & \cdots & \varepsilon_{1m} \\ 0 & \sum\limits_{j}\varepsilon_{2j} & \cdots & 0 & \varepsilon_{21} & \varepsilon_{22} & \cdots & \varepsilon_{2m} \\ \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sum\limits_{j}\varepsilon_{nj} & \varepsilon_{n1} & \varepsilon_{n2} & \cdots & \varepsilon_{nm} \\ \varepsilon_{11} & \varepsilon_{21} & \cdots & \varepsilon_{n1} & \sum\limits_{i}\varepsilon_{i1} & 0 & \cdots & 0 \\ \varepsilon_{12} & \varepsilon_{22} & \cdots & \varepsilon_{n2} & 0 & \sum\limits_{i}\varepsilon_{i2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots \\ \varepsilon_{1m} & \varepsilon_{2m} & \cdots & \varepsilon_{nm} & 0 & 0 & \cdots & \sum\limits_{i}\varepsilon_{im} \end{pmatrix},$ wherein a matrix of column differences, c_(ij), includes c_(ij)=w_(ij+1)−w_(ij) whenever both w_(ij+1) and w_(ij) are not null, and c_(ij)=null, otherwise, wherein a matrix of row differences, r_(ij), includes r_(ij)=w_(i+1j)−w_(ij) whenever both w_(i+1j) and w_(ij) are not null, and r_(ij)=null, otherwise, wherein (i, j) refers to a j^(th) day of an i^(th) week, for n weeks with m days in each week, wherein x_(ij) includes valid numerical data, and if data is not valid on (i, j), x_(ij)=null, wherein v_(ij) includes v_(ij)=x_(ij), unless x_(ij)=0, in which case, v_(ij)=null, wherein w_(ij) includes w_(ij)=ln(v_(ij)) whenever v_(ij) is not null, and w_(ij)=null whenever v_(ij)=null, wherein in the matrix F the symbol ε_(ij), where ε_(ij)=1 when y_(ij) is not null, and ε_(ij)=0 when y_(ij) is null, wherein $y_{ij} = \begin{cases} x_{ij} - \mu; & (i,j) \in S \\ \text{null}; & (i,j) \notin S \end{cases},$ wherein x_(ij) denotes entries of a logarithm matrix, X, wherein a set, S=$\{ (i,j) \mid x_{ij} \neq \text{null} \}$, wherein $\mu = \frac{1}{o(S)} \sum\limits_{(i,j) \in S} x_{ij}.$
5. The method of claim 1, including forecasting future transaction activity utilizing the complete transaction data.
6. A machine-readable storage medium storing a sequence of instructions that, when executed by a computer, cause the computer to perform the method comprising: receiving incomplete transaction data; determining a gap in the incomplete transaction data; and using an algorithm to generate data to fill in the gap and to generate complete transaction data, wherein the algorithm is selected from a group including a first algorithm and a second algorithm, wherein the first algorithm is automatically to: determine a dominant pattern in the transaction data; identify a region within the dominant pattern that corresponds to the gap in the transaction data; and adopt data associated with the corresponding region in the gap to minimize impact on the dominant pattern; and wherein the second algorithm includes a Moore-Penrose pseudo-inverse algorithm to choose at least a portion of the transaction data to fill in the gap based on a set of substitute data from among a group of substitute data sets and to adopt the set of substitute data into the gap; and wherein the first algorithm includes (i, j) referring to a j^(th) day of an i^(th) week, for n weeks with m days in each week, wherein x_(ij) includes valid numerical data, and if the data is not valid on (i, j), x_(ij)=null, wherein v_(ij) includes v_(ij)=x_(ij), unless x_(ij)=0, in which case, v_(ij)=null, wherein w_(ij) includes w_(ij)=ln(v_(ij)) whenever v_(ij) is not null, and w_(ij)=null whenever v_(ij)=null, wherein a matrix of column differences, c_(ij), includes c_(ij)=w_(ij+1)−w_(ij) whenever both w_(ij+1) and w_(ij) are not null, and c_(ij)=null, otherwise, wherein a matrix of row differences, r_(ij), includes r_(ij)=w_(i+1j)−w_(ij) whenever both w_(i+1j) and w_(ij) are not null, and r_(ij)=null, otherwise, wherein a j^(th) column of c_(ij) includes at least one non-null entry, and c_(*j) includes an average of each non-null entry in the j^(th) column of c_(ij), otherwise, c_(*j)=0, wherein an i^(th) row of r_(ij) includes at least one non-null entry, and r_(i*) includes an average of each non-null entry in the i^(th) row of r_(ij), otherwise, r_(i*)=0, wherein C_(j+1)=C_(j)+c_(*j), where C₁=0, wherein R_(i+1)=R_(i)+r_(i*), where R₁=0, wherein u_(ij)=R_(i)+C_(j), wherein K includes an average of w_(ij)−u_(ij) over each (i, j) entry where w_(ij) is not null, wherein y_(ij)=w_(ij) whenever w_(ij) is not null and otherwise, y_(ij)=K+u_(ij), wherein output z_(ij)=Round(exp(y_(ij))), wherein the output z_(ij) corresponds to filling in the gap.
7. A system comprising: an interface to receive transaction data; and a transaction gap processor module configured to: determine a gap in the transaction data; determine a dominant pattern in the transaction data; identify a region within the dominant pattern that corresponds to the gap in the transaction data; and adopt data associated with the corresponding region into the gap to minimize impact on the dominant pattern wherein the transaction gap processor module incorporates an algorithm that includes a formula for output z_(ij)=Round(exp(y_(ij))), wherein the output z_(ij) corresponds to filling in the gap, wherein (i, j) refers to a j^(th) day of an i^(th) week, for n weeks with m days in each week, wherein y_(ij)=w_(ij) whenever w_(ij) is not null and otherwise, y_(ij)=K+u_(ij), wherein K includes an average of w_(ij)−u_(ij) over each (i, j) entry where w_(ij) is not null, wherein C_(j+1)=C_(j)+c_(*j), where C₁=0, wherein R_(i+1)=R_(i)+r_(i*), where R₁=0, wherein u_(ij)=R_(i)+C_(j), wherein a matrix of column differences, c_(ij), includes c_(ij)=w_(ij+1)−w_(ij) whenever both w_(ij+1) and w_(ij) are not null, and c_(ij)=null, otherwise, wherein a matrix of row differences, r_(ij), includes r_(ij)=w_(i+1j)−w_(ij) whenever both w_(i+1j) and w_(ij) are not null, and r_(ij)=null, otherwise, wherein a j^(th) column of c_(ij) includes at least one non-null entry, and c_(*j) includes an average of each non-null entry in the j^(th) column of c_(ij), otherwise, c_(*j)=0, wherein an i^(th) row of r_(ij) includes at least one non-null entry, and r_(i*) includes an average of each non-null entry in the i^(th) row of r_(ij), otherwise, r_(i*)=0, wherein x_(ij) includes valid numerical data, and if the data is not valid on (i, j), x_(ij)=null, wherein v_(ij) includes v_(ij)=x_(ij), unless x_(ij)=0, in which case, v_(ij)=null, wherein w_(ij) includes w_(ij)=ln(v_(ij)) whenever v_(ij) is not null, and w_(ij)=null whenever v_(ij)=null.
8. A system comprising: an interface to receive transaction data; a transaction gap processor module configured to: determine a gap in the transaction data; use a Moore-Penrose pseudo-inverse algorithm to determine transaction data to fill in the gap based on a set of substitute data from among a group of substitute data sets; and adopt the set of substitute data into the gap; wherein the transaction gap module includes an equation Fh=g, wherein Fh=g includes a plurality of solutions, for h, wherein a solution from the plurality of solutions that is selected to fill in the gap is the solution for h, such that ∥h∥ is minimized solving for h=F⁺g, wherein a pseudoinverse of F includes F⁺, wherein vectors h and g include: $h = \begin{pmatrix} r_{1} \\ \vdots \\ r_{n} \\ c_{1} \\ \vdots \\ c_{m} \end{pmatrix}, \quad g = \begin{pmatrix} \sum\limits_{j \in P_{1}} y_{1j} \\ \vdots \\ \sum\limits_{j \in P_{n}} y_{nj} \\ \sum\limits_{i \in Q_{1}} y_{i1} \\ \vdots \\ \sum\limits_{i \in Q_{m}} y_{im} \end{pmatrix},$ respectively, wherein $F = \begin{pmatrix} \sum\limits_{j}\varepsilon_{1j} & 0 & \cdots & 0 & \varepsilon_{11} & \varepsilon_{12} & \cdots & \varepsilon_{1m} \\ 0 & \sum\limits_{j}\varepsilon_{2j} & \cdots & 0 & \varepsilon_{21} & \varepsilon_{22} & \cdots & \varepsilon_{2m} \\ \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sum\limits_{j}\varepsilon_{nj} & \varepsilon_{n1} & \varepsilon_{n2} & \cdots & \varepsilon_{nm} \\ \varepsilon_{11} & \varepsilon_{21} & \cdots & \varepsilon_{n1} & \sum\limits_{i}\varepsilon_{i1} & 0 & \cdots & 0 \\ \varepsilon_{12} & \varepsilon_{22} & \cdots & \varepsilon_{n2} & 0 & \sum\limits_{i}\varepsilon_{i2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots \\ \varepsilon_{1m} & \varepsilon_{2m} & \cdots & \varepsilon_{nm} & 0 & 0 & \cdots & \sum\limits_{i}\varepsilon_{im} \end{pmatrix},$ wherein a matrix of column differences, c_(ij), includes c_(ij)=w_(ij+1)−w_(ij) whenever both w_(ij+1) and w_(ij) are not null, and c_(ij)=null, otherwise, wherein a matrix of row differences, r_(ij), includes r_(ij)=w_(i+1j)−w_(ij) whenever both w_(i+1j) and w_(ij) are not null, and r_(ij)=null, otherwise, wherein (i, j) refers to a j^(th) day of an i^(th) week, for n weeks with m days in each week, wherein x_(ij) includes valid numerical data, and if data is not valid on (i, j), x_(ij)=null, wherein v_(ij) includes v_(ij)=x_(ij), unless x_(ij)=0, in which case, v_(ij)=null, wherein w_(ij) includes w_(ij)=ln(v_(ij)) whenever v_(ij) is not null, and w_(ij)=null whenever v_(ij)=null, wherein in the matrix F the symbol ε_(ij), where ε_(ij)=1 when y_(ij) is not null, and ε_(ij)=0 when y_(ij) is null, wherein $y_{ij} = \begin{cases} x_{ij} - \mu; & (i,j) \in S \\ \text{null}; & (i,j) \notin S \end{cases},$ wherein x_(ij) denotes entries of a logarithm matrix, X, wherein a set, S=$\{ (i,j) \mid x_{ij} \neq \text{null} \}$, wherein $\mu = \frac{1}{o(S)} \sum\limits_{(i,j) \in S} x_{ij}.$
9. A system of claim 8 wherein the gap includes at least one of a data error and a data omission, and the transaction data comprises data regarding frequency of transactions during periods of time.
10. A system comprising: means for receiving transaction data; means for determining a gap in the transaction data; means for determining a dominant pattern in the transaction data; means for identifying a region within the dominant pattern that corresponds to the gap in the transaction data; and means for adopting data associated with the corresponding region into the gap to minimize impact on the dominant pattern and wherein the means for determining a gap incorporates an algorithm that includes a formula for output z_(ij)=Round(exp(y_(ij))), wherein the output z_(ij) corresponds to filling in the gap, wherein (i, j) refers to a j^(th) day of an i^(th) week, for n weeks with m days in each week, wherein y_(ij)=w_(ij) whenever w_(ij) is not null and otherwise, y_(ij)=K+u_(ij), wherein K includes an average of w_(ij)−u_(ij) over each (i, j) entry where w_(ij) is not null, wherein C_(j+1)=C_(j)+c_(*j), where C₁=0, wherein R_(i+1)=R_(i)+r_(i*), where R₁=0, wherein u_(ij)=R_(i)+C_(j), wherein a matrix of column differences, c_(ij), includes c_(ij)=w_(ij+1)−w_(ij) whenever both w_(ij+1) and w_(ij) are not null, and c_(ij)=null, otherwise, wherein a matrix of row differences, r_(ij), includes r_(ij)=w_(i+1j)−w_(ij) whenever both w_(i+1j) and w_(ij) are not null, and r_(ij)=null, otherwise, wherein a j^(th) column of c_(ij) includes at least one non-null entry, and c_(*j) includes an average of each non-null entry in the j^(th) column of c_(ij), otherwise, c_(*j)=0, wherein an i^(th) row of r_(ij) includes at least one non-null entry, and r_(i*) includes an average of each non-null entry in the i^(th) row of r_(ij), otherwise, r_(i*)=0, wherein x_(ij) includes valid numerical data, and if the data is not valid on (i, j), x_(ij)=null, wherein v_(ij) includes v_(ij)=x_(ij), unless x_(ij)=0, in which case, v_(ij)=null, wherein w_(ij) includes w_(ij)=ln(v_(ij)) whenever v_(ij) is not null, and w_(ij)=null whenever v_(ij)=null.